Corpora and Translation: Uses and Future Prospects
Abstract
Although corpora have been an object of study for some decades, the 1980s saw increased interest in their construction and use. With this interest has come an expansion in the application areas for which corpus-based approaches are considered relevant. This paper seeks to define the concept of a corpus and to discuss its relevance to two application areas in particular: automatic and manual translation.
Similar Resources
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, yet in many cases people incorrectly use the space character instead. This common incorrect use of space leads to some s...
Extracting Parallel Corpora from Comparable Documents to Improve Translation Quality in Machine Translation Systems
Data for training statistical machine translation methods are usually drawn from three kinds of resources: parallel, non-parallel, and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...
Corpus-Centered Computation
To achieve translation technology that is adequate for speech-to-speech translation (S2S), this paper introduces a new approach named Corpus-Centered Computation (abbreviated to C³ and pronounced c-cube). As opposed to conventional approaches adopted by machine translation systems for written language, C³ places corpora at the center of the technology. For example, translation knowledge is extrac...
Small Hydro-Power Plants in Kenya: A Review of Status, Challenges and Future Prospects
Small Hydro-power Plants (SHP) are an important source of electricity in many countries. However, little is known about SHP in Kenya. This paper reviews the status, challenges in implementation of SHP and prospects for future development of SHP in Kenya. The paper shows that SHP has not yet fully utilized the available hydro-power potential. The challenges associated with SHP development should...
Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora
Statistical machine translation systems are usually trained on large amounts of bilingual text and monolingual text. In this paper, we propose a method to perform domain adaptation for statistical machine translation, where in-domain bilingual corpora do not exist. This method first uses out-of-domain corpora to train a baseline system and then uses in-domain translation dictionaries and in...